Gender Prediction for Chinese Social Media Data

Authors

  • Wen Li
  • Markus Dickinson
Abstract

Social media provides users with a platform to publish messages and socialize with others, and microblogs have gained more users than ever in recent years. With such usage, user profiling has become a popular task in computational linguistics and text mining. Different approaches have been used to predict users’ gender, age, and other information, but most of this work has been done on English and other Western languages. The goal of this project is to predict the gender of users based on their posts on Weibo, a Chinese micro-blogging platform. Given issues in Chinese word segmentation, we explore character and word n-grams as features for this task, as well as character and word embeddings for classification. Given how the data is extracted, we approach the task on a per-post basis, and we show the difficulties of the task for both humans and computers. Nonetheless, we present encouraging results and point to future improvements.
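As a rough illustration of why character n-grams sidestep Chinese word segmentation, the sketch below counts overlapping character unigrams and bigrams directly from the raw text of a post. The function names, the choice of n = 1 and 2, and the example post are assumptions made for illustration only; they are not taken from the paper and do not reflect the authors' actual feature configuration.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return all overlapping character n-grams of `text`, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


def ngram_features(text, n_values=(1, 2)):
    """Build a bag-of-n-grams count dictionary for each n in `n_values`.

    The default (1, 2) is an illustrative assumption, not the paper's setting.
    """
    counts = Counter()
    for n in n_values:
        counts.update(char_ngrams(text, n))
    return counts


# Hypothetical example post (not from the paper's data).
post = "今天天气很好"
features = ngram_features(post)
```

Because the n-grams are taken over characters, no segmenter is needed; word n-grams would require running a segmentation step first and applying the same counting to the resulting tokens.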

Similar resources

Multi-task Learning for Gender and Age Prediction on Chinese Microblog

The demographic attributes gender and age play an important role for social media applications. Previous studies on gender and age prediction mostly explore efficient features which are labor intensive. In this paper, we propose to use the multi-task convolutional neural network (MTCNN) model for predicting gender and age simultaneously on Chinese microblog. With MTCNN, we can effectively reduc...

Improving Gender Prediction of Social Media Users via Weighted Annotator Rationales

This paper proposes and contrastively evaluates several novel approaches to utilizing annotator rationales to improve the prediction of user gender in social media for English and Spanish. Our methods outperform state-of-the-art systems for Twitter gender prediction, and yield up to 28% error reduction relative to an otherwise identical system and training data without the use of annotator rati...

The Relationship between virtual social networks usage and gender role attitude in university students of Iran

Background & aim: Gender role attitude is one of the key issues affecting the stability of family foundation, which is under the influence of mass media. New media and the process of globalization are effective in promoting gender equality, and an example of modern media is virtual social networks. This study aimed to determine the relationship between virtual social media usage and gender role...

Age and Gender Identification in Social Media

This paper describes the submission of the University of Washington’s Center for Data Science to the PAN 2014 author profiling task. We examine the predictive quality in terms of age and gender of several sets of features extracted from various genres of online social media. Through comparison, we establish a feature set which maximizes accuracy of gender and age prediction across all genres ex...

Gender Prediction in Social Media

In this paper, we explore the task of gender classification using limited network data with an application to Fotolog. We take a heuristic approach to automating gender inference based on username, followers and network structure. We test our approach on a subset of 100,000 nodes and analyze our results to find that there is a lot of value in this limited information and that there is great pr...

Publication date: 2017